Abstract

Learning is facilitated by conversational interactions both with human tutors and with computer agents that simulate 
human tutoring and ideal pedagogical strategies. In this article, we describe some intelligent tutoring systems (e.g., 
AutoTutor) in which agents interact with students in natural language while being sensitive to their cognitive and 
emotional states. These systems include one-on-one tutorial dialogues, conversational trialogues in which two agents 
(a tutor and a “peer”) interact with a human student, and other conversational ensembles in which agents take on 
different roles. Tutorial conversations with agents have also been incorporated into educational games. These learning 
environments have been developed for different populations (elementary through high school students, college 
students, adults with reading difficulties) and different subjects spanning science, technology, engineering, mathematics, 
reading, writing, and reasoning. This article identifies some of the conversation patterns that are implemented in the 
dialogues and trialogues. 

The positive impact of human tutoring on student learning
is a well-established empirical finding that has motivated
policies in many educational systems. Meta-analyses
that have compared tutoring to classroom teaching and
other suitable comparison conditions have reported
effect sizes between σ = 0.20 and σ = 1.00 (Cohen, Kulik,
& Kulik, 1982; Graesser, D’Mello, & Cade, 2011). An effect
size is computed by taking the difference between the mean
test scores of participants in the treatment and comparison
conditions and dividing it by the pooled standard deviation.
Expertise varies substantially among tutors, who include 
student peers, students slightly older than their tutees, 
community volunteers, paraprofessionals, college stu- 
dents with some pedagogical training, and experienced 
tutors with substantial subject-matter knowledge and 
pedagogical training. Sometimes one tutor simultane- 
ously handles two or more students who are experienc- 
ing similar problems. 
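
As a worked illustration of the effect-size computation described above, the following Python sketch computes a standardized mean difference using a pooled standard deviation. The function name and the test scores are hypothetical and invented for illustration; they are not data from the cited meta-analyses.

```python
import math
from statistics import mean, variance


def effect_size(treatment, comparison):
    """Standardized mean difference: (mean_t - mean_c) / pooled SD."""
    n_t, n_c = len(treatment), len(comparison)
    # statistics.variance is the sample variance (n - 1 denominator).
    pooled_var = (
        (n_t - 1) * variance(treatment) + (n_c - 1) * variance(comparison)
    ) / (n_t + n_c - 2)
    return (mean(treatment) - mean(comparison)) / math.sqrt(pooled_var)


# Hypothetical test scores, invented for illustration only.
tutored = [78, 85, 90, 72, 88]
classroom = [70, 75, 80, 68, 77]
print(round(effect_size(tutored, classroom), 2))
```

Under this definition, an effect size of 1.00 means that the treatment mean lies one pooled standard deviation above the comparison mean.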

The question of why tutoring is effective in helping 
learning is far from settled, but detailed analyses have 
been conducted on the discourse, language, facial expres- 
sions, gestures, and actions used in tutorial conversations 
(Graesser et al., 2011; Graesser, Person, & Magliano,
1995). Researchers are currently investigating which
conversational components are likely to explain learning
gains. Tutor effectiveness does not simply consist in lec- 
turing the student, which can be done in a classroom, but 
rather in attempts to get the student to construct answers 
and solutions to problems (Chi, Siler, Yamauchi, Jeong, & 
Hausmann, 2001). Surprisingly, tutor effectiveness cannot 
be attributed to a fine-grained diagnosis of what the stu- 
dent knows, to high shared knowledge (i.e., common 
ground; Clark, 1996) between the tutor and student, or to 
accurate feedback given to the student. Tutors have lim- 
ited abilities to diagnose student knowledge because 
their shared knowledge is minimal. Instead, most human 
tutors follow a systematic conversational structure that 
has been termed expectation- and misconception-tailored 
(EMT) dialogue. 

What is EMT dialogue? Human tutors anticipate particular
correct answers (called “expectations”) and particular
misconceptions when they ask the students
challenging questions (or present them with challenging 
problems) and trace the students’ reasoning. As the stu- 
dents express their answers, which are distributed over 
multiple conversational turns, tutors compare these con- 
tributions with these expectations and misconceptions. 
The tutors give feedback to the students that is sensitive 
to how well the students’ contributions match the expec- 
tations or misconceptions. The tutors also produce dia- 
logue moves to encourage the students to generate 
content and to improve their answers to challenging 
questions or problems. The following dialogue moves 
are prevalent in the EMT dialogue of most human tutors, 
including expert tutors: 

Short Feedback. The feedback is either positive (“Yes”; 
“Very good”; a head nod; a smile), negative (“No”; 
“Not quite”; a head shake; a pregnant pause; a frown), 
or neutral (“Uh-huh”; “I see”). 

Pumps. The tutor gives nondirective pumps (“What 
else?”; “Tell me more”) to get the student to do the 
talking. 

Hints. The tutor gives hints to get the student to do the
talking or perform the actions, but directs the student 
along some conceptual path. The hints vary from 
being generic statements or questions (“What about 
. . . ?”; “Why not?”) to speech acts that more directly 
lead the student to a particular answer. Hints are the 
ideal scaffolding move to promote active student 
learning while directing the student to focus on rele- 
vant material. 

Prompts. The tutor asks a leading question in order to 
get the student to articulate a particular word or 
phrase. Sometimes students say very little, so these 
prompts are needed to get them to say something 
specific. 

Assertions. The tutor expresses a fact or state of affairs. 

Pump-hint-prompt-assertion cycles are frequently 
used by tutors to extract or cover particular expecta- 
tions. Eventually, all of the expectations are covered 
and the exchange is finished for the main question or 
problem. During this process, students occasionally ask 
questions, which are immediately answered by the 
tutor, and occasionally express misconceptions, which 
are immediately corrected by the tutor. Consequently, 
there are other categories of tutor dialogue moves: 
answers to student questions, corrections of student 
misconceptions, summaries, mini-lectures, and off- 
topic comments. 
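
The pump-hint-prompt-assertion cycle described above can be sketched as a simple escalation loop. This is a minimal sketch under stated assumptions, not the procedure used by human tutors or by any particular system: the function name, the `is_covered` matcher, and the scripted student replies are all hypothetical.

```python
# Escalating dialogue moves, from least to most directive.
MOVES = ["pump", "hint", "prompt", "assertion"]


def cover_expectation(expectation, student_turns, is_covered):
    """Escalate through the cycle until one expectation is covered.

    student_turns: iterator of student replies (hypothetical stand-in
        for live input).
    is_covered: function(reply, expectation) -> bool (hypothetical matcher).
    Returns the list of tutor moves that were used.
    """
    used = []
    for move in MOVES:
        used.append(move)
        if move == "assertion":
            # The tutor states the expectation directly, which ends the cycle.
            break
        reply = next(student_turns, "")
        if is_covered(reply, expectation):
            break
    return used


# Scripted replies, invented for illustration.
replies = iter(["I'm not sure", "something about groups", "randomly"])
matcher = lambda reply, expectation: expectation in reply.lower()
print(cover_expectation("randomly", replies, matcher))
```

In this sketch the tutor falls back on an assertion only after the less directive moves have failed, which mirrors the goal of getting the student, rather than the tutor, to supply the content.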


Simulating Human Tutors With AutoTutor

AutoTutor is an intelligent tutoring system that is designed 
to simulate the discourse moves of human tutors and also 
to implement some ideal tutoring strategies (Graesser 
et al., 2012; Graesser et al., 2004). In the program, there 
is an animated conversational agent that generates 
speech, facial expressions, and some gestures. 
Conversational agents have been increasingly popular in 
contemporary advanced learning environments (Biswas, 
Jeong, Kinnebrew, Sulcer, & Roscoe, 2010; Gholson et al., 
2009; McNamara, O’Reilly, Best, & Ozuru, 2006; Olney 
et al., 2012; Ward et al., 2013). All of the components of 
EMT dialogue can be implemented in AutoTutor because 
of advances in computational linguistics (Jurafsky & 
Martin, 2008) and statistical representations of world 
knowledge (Landauer, McNamara, Dennis, & Kintsch, 
2007). AutoTutor is capable of classifying the students’ 
contributions into different categories of speech acts, such 
as questions, statements, metacognitive expressions (“I 
don’t know”; “I see”), short responses (“Okay”), and 
expressive evaluations (“This is terrible”). AutoTutor 
responds adaptively, in a fashion that is sensitive to the 
students’ speech-act categories and the quality of their 
statements. 
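
To make the speech-act categories above concrete, the following sketch assigns an utterance to one of the five categories using fixed phrase lists. It is a deliberately crude illustration: the function name and the phrase lists are hypothetical, and this keyword lookup is an assumption for exposition rather than the classifier that AutoTutor uses.

```python
# Hypothetical phrase lists, seeded with the examples given in the text.
METACOGNITIVE = {"i don't know", "i see"}
SHORT_RESPONSES = {"okay", "ok", "yes", "no"}
EXPRESSIVE = {"this is terrible", "this is great"}


def classify_speech_act(utterance):
    """Return a speech-act category for one student utterance."""
    stripped = utterance.strip()
    if stripped.endswith("?"):
        return "question"
    text = stripped.lower().rstrip(".!")
    if text in METACOGNITIVE:
        return "metacognitive expression"
    if text in SHORT_RESPONSES:
        return "short response"
    if text in EXPRESSIVE:
        return "expressive evaluation"
    # Anything else is treated as a content-bearing statement.
    return "statement"


for example in ["I don't know", "Okay", "This is terrible", "Why?"]:
    print(example, "->", classify_speech_act(example))
```

Only utterances classified as statements would then need to be evaluated for quality against the expectations and misconceptions.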

We emphasize that AutoTutor cannot interpret all 
speech acts that students produce, but it can simulate 
EMT dialogue, and that does help students learn. As in 
the case of human tutors, AutoTutor has only an approxi- 
mate “understanding” of what a student knows and 
expresses, much of which is vague, ungrammatical, and 
semantically ill-formed. Nevertheless, AutoTutor can 
assess how well the student’s contributions in the various 
conversational turns match the expectations and miscon- 
ceptions by using semantic-pattern-matching algorithms 
that are sufficiently accurate. AutoTutor gives short feed- 
back that depends on the quality of student contributions 
in the previous turn. AutoTutor generates pumps, hints, 
prompts, and assertions to fill in missing words of expec- 
tations, following semantic-pattern-completion algo- 
rithms. AutoTutor can answer a subset of student 
questions, but students do not frequently ask questions 
in tutoring (Chi et al., 2001; Graesser et al., 1995), so this 
is not a major limitation. An example AutoTutor conver- 
sation is presented in Table 1. 
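
The idea of matching student contributions against expectations and misconceptions can be illustrated with a small sketch. Note the substitution: instead of the semantic-pattern-matching algorithms and statistical representations of world knowledge cited above, this sketch uses plain word-overlap cosine similarity, which is a much weaker stand-in chosen only because it is self-contained. All function names are hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercased word counts; a crude stand-in for a semantic representation."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def best_match(student_turns, candidates):
    """Return (candidate, score) for the candidate closest to the pooled turns.

    Turns are pooled because answers are distributed over multiple
    conversational turns.
    """
    pooled = bag_of_words(" ".join(student_turns))
    scored = [(c, cosine(pooled, bag_of_words(c))) for c in candidates]
    return max(scored, key=lambda pair: pair[1])


# Candidate sentences adapted from the example conversation in Table 1.
candidates = [
    "participants are assigned to groups randomly",   # expectation
    "the data are not collected scientifically",      # misconception
]
turns = ["it has to do with how participants are assigned", "randomly"]
print(best_match(turns, candidates))
```

A tutor built on such scores could give positive feedback when the best match is an expectation and a correction when it is a misconception; the score threshold for counting an expectation as covered would be a tunable assumption.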

Empirical evidence supports the claim that AutoTutor 
and similar computer tutors that use natural-language 
dialogue yield learning gains comparable to those of 
trained human tutors, with effect sizes averaging 0.8 and 
ranging from 0.6 to 2.0 (Graesser et al., 2012; Graesser 
et al., 2004; Hu & Graesser, 2004; McNamara et al., 2006; 
Olney et al., 2012; VanLehn, 2011; VanLehn et al., 2007). 
Indeed, direct comparisons between these computer
tutors and human tutors have shown no significant differ- 
ences. These assessments have covered subject matters 
and skills in the areas of science and technology (e.g., 
physics, biology, computer literacy, and scientific reason- 
ing) as well as reading comprehension. The quality of the 
dialogue in AutoTutor is also reasonably coherent, 
although not perfect. In fact, the dialogue is sufficiently 
tuned so that a bystander who reads the tutorial dialogue 
in print cannot tell whether a particular tutor turn was 
generated by AutoTutor or by an expert human tutor. 

Several versions of AutoTutor have been developed 
since 1997, when the system was created. The conversa- 
tional agents with talking heads have been compared 
with conversational agents using speech alone, chat mes- 
sages in print, and multiple communication channels. 
Students communicating in text versus spoken utterances 
have also been compared. For the most part, it is the 
semantic, conceptual content in the conversation that 
predicts learning, not the medium of communication 
(D’Mello, Dowell, & Graesser, 2011; Graesser, Jeon, & 
Dufty, 2008; VanLehn et al., 2007). One version of 
AutoTutor, called AutoTutor-3D, guides learners on using 
interactive simulations of physics microworlds (Graesser, 
Chipman, Haynes, & Olney, 2005). An interactive simula- 
tion world with people, objects, and a spatial setting is 
created for each physics problem. Students manipulate 
parameters of the situation (e.g., mass of objects, speed 
of objects, distance between objects) and then ask the 
system to simulate what will happen. Students are also 
prompted to describe what they see. Their actions and 
descriptions are evaluated with respect to matching 
expectations or misconceptions. AutoTutor manages the 
dialogue with hints and suggestions that scaffold the 
interactive simulation experience. AutoTutor-3D yields a 
significant improvement in learning (σ = 0.2) over the
strictly conversational AutoTutor for those students who 
run the simulations multiple times rather than only once 
(Jackson, Olney, Graesser, & Kim, 2006). 

Another version of AutoTutor is sensitive to students’ 
emotions and responds with appropriate emotional 
expressions (D’Mello & Graesser, 2012). Students’ emo- 
tions are detected by the computer through sensing 
channels on the basis of dialogue patterns during tutor- 
ing, the content covered, and the students’ facial expres- 
sions, body posture, and speech intonation. The primary 
emotions that occur during learning with AutoTutor are 
frustration, confusion, boredom, and flow (engagement); 
surprise and delight also occur occasionally (Graesser & 
D’Mello, 2012). An AutoTutor that is supportive and emo- 
tionally empathetic toward low-knowledge students 
helps learning more than an AutoTutor that is not emo- 
tionally sensitive. 


Agents in Trialogues 

Versions of AutoTutor have recently been developed for 
group conversations. The incremental value of multiple 
agents is that the human learner can learn by observing 
how the agents interact. A student can learn vicariously 
by observing one agent communicating with another 
agent, showing how actions are performed, and reason- 
ing collaboratively with the other agent. Interactions of 
two agents with one human student are called trialogues. 
The two agents can disagree, contradict each other, and 
hold an argument, periodically turning to the student to 
solicit his or her perspective (Lehman et al., 2013). Such 
experiences put students in cognitive disequilibrium, 
which encourages problem solving, reasoning, and, ulti- 
mately, deep learning. This section describes some sys- 
tems using trialogues that help students learn in a social 
world. 

It is possible to extend the horizons of intelligent 
tutoring systems beyond trialogues. Researchers have 
developed systems in which a human student interacts 
with more than two agents, as well as systems in which 
one tutor agent serves several students in multiparty col- 
laborative learning (Kumar & Rose, 2011). Researchers 
are also developing multiple-agent configurations for use 
in high-stakes assessment environments, including those 
of the Educational Testing Service. However, it is beyond 
the scope of this article to cover these larger group con- 
figurations and high-stakes assessments. 

Trialogues with agents for adult literacy

As an illustration, AutoTutor trialogues are currently 
being developed for the Center for the Study of Adult 
Literacy (CSAL; http://csal.gsu.edu/content/homepage). 
The goal is to help adults with low literacy read better so 
they can improve their lives. Figure 1 shows a snapshot 
of the interface for AutoTutor-CSAL. In this setup, there is 
a teacher agent (Cristina, top left), a student agent 
(Jordan, top right), and the human who interacts with the 
agents (Haiying). The program is a competition between 
the human learner and the student agent, with guidance 
from the teacher agent. The learner’s task is to apply a 
clarifying strategy to identify the meaning of words with 
multiple meanings in context (in this case, the two meanings
of fan). A window at the top shows a sentence in
which the focal word (“fans”) is underlined. Below this 
sentence are two images and associated sentences that 
represent two meanings of fan. Two scoreboards at the 
bottom of the screen display the names of the human 
learner and the student agent and their current scores in 
the competition. 

Fig. 1. Screen shot of conversational agents in an AutoTutor trialogue designed to help adult learners read. The screen shows the sentence “Why are movie stars so cool? Because they have so many fans.” and two candidate meanings: “enthusiastic followers” and “an instrument for producing a current of air.”


How are the trialogues played out? Cristina first asks a 
question and asks the human learner, Haiying, to choose 
the correct answer by clicking on one of the two answers. 
When clicked, the answer changes its color. Cristina then 
asks Jordan to choose the answer. When Jordan gives an 
answer, either agreeing or disagreeing with Haiying, 
Cristina gives the two of them feedback, shows the cor- 
rect answer in green, and updates the scores. In sum- 
mary, the agents navigate the human learner through the 
experience, present him or her with competition, give 
feedback and explanations for why answers are correct 
or incorrect, and keep score on points in the competi- 
tion. Across all of the lessons in CSAL, both agents and 
human learners play many roles: teacher, helper, collabo- 
rator, ideal model, judge, and critic. 
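
The round structure described above (both players answer, the teacher agent gives feedback, and the scores are updated) can be sketched as follows. The function name, the score dictionary keys, and the one-point-per-correct-answer rule are assumptions made for illustration; the article does not specify how AutoTutor-CSAL awards points.

```python
def play_round(correct, human_choice, peer_choice, scores):
    """Run one competition round and return the teacher agent's feedback lines.

    scores: dict with keys "human" and "peer" (hypothetical names),
        updated in place.
    Scoring rule (assumed): one point per correct answer.
    """
    feedback = []
    for player, choice in (("human", human_choice), ("peer", peer_choice)):
        if choice == correct:
            scores[player] += 1
            feedback.append(f"{player}: correct")
        else:
            feedback.append(f"{player}: incorrect, the answer is '{correct}'")
    agreement = "agrees" if peer_choice == human_choice else "disagrees"
    feedback.append(f"peer {agreement} with human")
    return feedback


# The two meanings of "fan" from the example in Figure 1.
scores = {"human": 0, "peer": 0}
lines = play_round(
    correct="enthusiastic followers",
    human_choice="enthusiastic followers",
    peer_choice="an instrument for producing a current of air",
    scores=scores,
)
print(lines)
print(scores)
```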

Trialogues with agents in scientific reasoning

A detailed example of a trialogue is presented in this sec- 
tion. This trialogue is based on an instructional game 
initially called Operation ARIES! (Millis et al., 2011), 
which was subsequently commercialized by Pearson 
Education as Operation ARA (Halpern et al., 2012). ARIES 
is an acronym for Acquiring Research Investigative and 
Evaluative Skills, whereas ARA is an acronym for 
Acquiring Research Acumen. The game teaches high 
school and college students how to critically evaluate 
aspects of scientific investigations, such as the need for
control groups, the distinction between correlation and
causation, and operational definitions. In one game 
phase, the two agents and student collaboratively answer 
difficult questions based on a text the student has read. 
In another phase, the trialogue involves critiquing case 
studies of scientific research with respect to scientific 
methodology. 

Table 1 shows a conversation that demonstrates an 
EMT trialogue in Operation ARA. Tom is the name of the 
human student, whereas Dr. Quinn is the teacher agent 
and Glass is a student agent. Dialogue-move categories 
are shown in brackets and italics. Sometimes Glass gives 
incorrect answers, whereas Dr. Quinn has the ground 
truth. The trialogue encourages Tom to fill in information, 
answer questions, and resolve disagreements. Therefore, 
the human learner is an active participant rather than 
merely a passive, vicarious observer. 

The trialogues in ARIES and ARA have been shown to 
help students learn (Halpern et al., 2012; Millis et al., 
2011), but we are still exploring what characteristics of 
the conversational interaction account for this learning 
(Forsyth et al., 2012). We do know that AutoTutor dialogues
and trialogues increase learning substantially
compared with reading texts on the content for an equiv- 
alent amount of time (Graesser et al., 2004; Kopp, Britt, 
Millis, & Graesser, 2012; VanLehn et al., 2007). However, 
we need to better understand why and how. 

Table 1. Example Trialogue Conversation With Operation ARA


Dr. Quinn: I really think we should talk about this topic a little further.

Dr. Quinn: Why do you think it is that you can’t make a causal inference when you do a correlational study? [Main question]

Tom: Because the data aren’t collected scientifically.

Glass: I agree Tom.

Glass: It is hard to make a causal conclusion with correlational research because the data aren’t collected scientifically.

Dr. Quinn: That is not right, Glass. [Negative feedback]

Dr. Quinn: The data are collected scientifically in correlational research. This is not the problem. [Correction] Think about this
some more. [Pump]

Glass: So, what would be the reason that causal inferences are not appropriate when doing correlational research? This is making
me think about how it is that participants are assigned to groups in true experiments. [Peer hint] Tom, can you help me with
this question?

Tom: It has to do with the assignment of the participants.

Dr. Quinn: O.k. [Neutral feedback]

Dr. Quinn: Causal inferences can really only be made in true experiments when participants are assigned to groups in what
way? [Prompt]

Tom: Randomly.

Dr. Quinn: Great! [Positive feedback]

Dr. Quinn: When participants are assigned to groups randomly. [Correct prompt completion]

Dr. Quinn: Think about this.

Dr. Quinn: Remember, you’re trying to explain why it is that causal inferences are not appropriate when doing correlational
research. It might help you to think about the various possible explanations for a significant correlation, and then answer the
question.

Tom: I think it has to do with the sample size.

Glass: No, I don’t think that is the answer.

Glass: It seems to me that with correlational research, relationships are identified, but it is hard to know which variable caused
the change in the other variable . . .

Dr. Quinn: In summary, correlations tell us about relationships . . . [Summary]


Note: In this example trialogue, Tom is the human learner, Dr. Quinn is the tutor agent, and Glass is the student agent. Dialogue-move categories
are annotated with italics in brackets.


In closing, this article has described how AutoTutor
dialogues and trialogues help students learn by engaging
them in conversations in natural language. These computer
agents exhibit conversation patterns of human
tutors in addition to ideal pedagogical strategies. Learning 
is facilitated by these conversational-agent environments, 
compared with students reading textbooks and engaging 
in other noninteractive learning environments. Perhaps
this is not surprising, given that human tutors are also 
more effective than classrooms and textbook reading. 
Indeed, learning has occurred for millennia in appren- 
ticeship contexts in which the learner communicates 
with the tutor, master, or mentor in natural language. This 
is the moment in history when researchers in computa- 
tional linguistics, artificial intelligence, intelligent tutoring 
systems, discourse processes, and the learning sciences 
are simulating many of these conversation patterns. The 
agents are not perfect simulations of humans, but they 
are good enough to help students learn. 